Non-linear frequency scale mapping for voice conversion in text-to-speech system with cepstral description
Authors
Abstract
Voice conversion, i.e. modification of a speech signal so that it sounds as if spoken by a different speaker, is used in speech synthesis to produce a new voice without the need for a new speech database. This paper introduces two new, simple non-linear methods of frequency scale mapping for transforming voice characteristics between male and female or child voices. The frequency scale mapping methods were developed primarily for use in the Czech and Slovak text-to-speech (TTS) system designed for the blind and based on the Pocket PC device platform. The system uses a cepstral description of the diphone speech inventory of a male speaker, with either the source-filter speech model or the harmonic speech model. Three new diphone speech inventories, corresponding to female, child, and young male voices, are created from the original male speech inventory. Listening tests are used to evaluate the voice transformation and the quality of the synthetic speech. © 2006 Elsevier B.V. All rights reserved.
Similar resources
Voice-based Age and Gender Recognition using Training Generative Sparse Model
Abstract: Gender recognition and age detection are important problems in telephone speech processing to investigate the identity of an individual using voice characteristics. In this paper a new gender and age recognition system is introduced based on generative incoherent models learned using sparse non-negative matrix factorization and atom correction post-processing method. Similar to genera...
Emotional Style Conversion in the TTS System with Cepstral Description
This contribution describes experiments with emotional style conversion performed on the utterances produced by the Czech and Slovak text-to-speech (TTS) system with cepstral description and basic prosody generated by rules. Emotional style conversion was realized as post-processing of the TTS output speech signal, and as a real-time implementation into the system. Emotional style prototypes rep...
The Study of Vocal Function in Patients With Early Laryngeal Carcinoma After Transoral Laser Microsurgery
Objective: Today transoral laser microsurgery is considered one of the first options to control early laryngeal cancer, and voice disorder is one of the inevitable complications of this therapeutic component. This study aimed to compare the vocal function in patients with early-stage laryngeal cancer following laser surgery with healthy individuals with normal voice quality using acoustic ana...
Using Context-based Statistical Models to Promote the Quality of Voice Conversion Systems
This article aims to examine methods of optimizing GMM-based voice conversion systems performance in which GMM method is introduced as the basic method for improvement of voice conversion systems performance. In the current methods, due to using a single conversion function to convert all speech units and subsequent spectral smoothing arising from statistical averaging, we will observe quality ...
Designing a New Non-Parallel Training Method for Voice Conversion with Better Performance than Parallel Training
Introduction: Voice mimicking by computers has been one of the most challenging topics of speech processing in recent years. A voice conversion system has two sides. On one side is the source speaker, whose voice is changed to mimic the voice of the target speaker on the other side. Two methods of p...
Journal title:
- Speech Communication
Volume: 48
Publication date: 2006